AITopics | deep learning system


deep learning system

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Representing Beauty: Towards a Participatory but Objective Latent Aesthetics

Rusnak, Alexander Michael

arXiv.org Artificial Intelligence, Oct-6-2025

What does it mean for a machine to recognize beauty? While beauty remains a culturally and experientially compelling but philosophically elusive concept, deep learning systems increasingly appear capable of modeling aesthetic judgment. In this paper, we explore the capacity of neural networks to represent beauty despite the immense formal diversity of objects to which the term applies. By drawing on recent work on cross-model representational convergence, we show how aesthetic content produces more similar and aligned representations between models that have been trained on distinct data and modalities - while unaesthetic images do not produce more aligned representations. This finding implies that the formal structure of beautiful images has a realist basis rather than being only a reflection of socially constructed values. Furthermore, we propose that these realist representations exist because of a joint grounding of aesthetic form in physical and cultural substance. We argue that human perceptual and creative acts play a central role in shaping the latent spaces of deep learning systems, but that a realist basis for aesthetics shows that machines are not mere creative parrots but can produce novel creative insights from the unique vantage point of scale. Our findings suggest that human-machine co-creation is not merely possible, but foundational - with beauty serving as a teleological attractor in both cultural production and machine perception.

artificial intelligence, machine learning, representation, (18 more...)

arXiv.org Artificial Intelligence

2510.02869

Country: Europe > Austria (0.28)

Genre: Research Report > New Finding (0.54)

Industry: Health & Medicine (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Development of a Neural Network Model for Currency Detection to aid visually impaired people in Nigeria

Nwokoye, Sochukwuma, Moru, Desmond

arXiv.org Artificial Intelligence, Aug-26-2025

Neural networks in assistive technology for the visually impaired leverage artificial intelligence's capacity to recognize patterns in complex data. They are used for converting visual data into auditory or tactile representations, helping the visually impaired understand their surroundings. The primary aim of this research is to explore the potential of artificial neural networks to help individuals with visual impairments differentiate between various forms of cash. In this study, we built a custom dataset of 3,468 images, which was subsequently used to train an SSD neural network model. The proposed system can accurately identify Nigerian cash, thereby streamlining commercial transactions. The system's accuracy was assessed, and the Mean Average Precision score was over 90%. We believe that our system has the potential to make a substantial contribution to the field of assistive technology while also improving the quality of life of visually challenged persons in Nigeria and beyond.

artificial intelligence, machine learning, nigeria, (16 more...)

arXiv.org Artificial Intelligence

2508.18012

Country: Africa > Nigeria (0.64)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Ophthalmology/Optometry (0.51)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.53)
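
The abstract above reports a Mean Average Precision above 90% but does not say which AP variant or matching threshold was used. As a rough illustration only, the sketch below assumes the non-interpolated variant: precision is averaged over the ranks at which true positives occur, then the per-class values are averaged. The class names and detections are made up for the example and are not data from the paper.

```python
def average_precision(scored_hits, num_ground_truth):
    """Non-interpolated AP for one class.

    scored_hits: list of (confidence, is_true_positive) pairs, one per detection.
    num_ground_truth: number of labelled objects of this class.
    """
    if num_ground_truth == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(scored_hits, key=lambda pair: pair[0], reverse=True)
    true_positives = 0
    precision_sum = 0.0
    for rank, (_, is_hit) in enumerate(ranked, start=1):
        if is_hit:
            true_positives += 1
            # Precision at this rank = hits so far / detections so far.
            precision_sum += true_positives / rank
    return precision_sum / num_ground_truth


def mean_average_precision(per_class):
    """per_class: dict mapping class name -> (scored_hits, num_ground_truth)."""
    aps = [average_precision(hits, n_gt) for hits, n_gt in per_class.values()]
    return sum(aps) / len(aps)


# Hypothetical detections for two hypothetical banknote classes.
detections = {
    "class_a": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "class_b": ([(0.95, True), (0.6, True)], 2),
}
ap_class_a = average_precision(*detections["class_a"])
map_score = mean_average_precision(detections)
```

Object-detection benchmarks often use interpolated variants and fix an overlap threshold for deciding what counts as a true positive, so numbers from this sketch are not directly comparable to the reported score.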


AXLearn: Modular Large Model Training on Heterogeneous Infrastructure

Lee, Mark, Gunter, Tom, Lan, Chang, Peebles, John, Zhou, Hanzhi, Zou, Kelvin, Bangalore, Sneha, Chiu, Chung-Cheng, Du, Nan, Du, Xianzhi, Dufter, Philipp, Hou, Ruixuan, Huang, Haoshuo, Hwang, Dongseong, Kong, Xiang, Lei, Jinhao, Lei, Tao, Li, Meng, Li, Li, Lu, Jiarui, Lu, Zhiyun, Ma, Yiping, Qiu, David, Rathod, Vivek, Tong, Senyu, Tu, Zhucheng, Wang, Jianyu, Wang, Yongqiang, Wang, Zirui, Weers, Floris, Wiseman, Sam, Yin, Guoli, Zhang, Bowen, Zhou, Xiyou, Zhuo, Danyang, Leong, Cheng, Pang, Ruoming

arXiv.org Artificial Intelligence, Jul-11-2025

We design and implement AXLearn, a production deep learning system that facilitates scalable and high-performance training of large deep learning models. Compared to other state-of-the-art deep learning systems, AXLearn has a unique focus on modularity and support for heterogeneous hardware infrastructure. AXLearn's internal interfaces between software components follow strict encapsulation, allowing different components to be assembled to facilitate rapid model development and experimentation on heterogeneous compute infrastructure. We introduce a novel method of quantifying modularity via Lines-of-Code (LoC)-complexity, which demonstrates how our system maintains constant complexity as we scale the components in the system, compared to linear or quadratic complexity in other systems. This allows integrating features such as Rotary Position Embeddings (RoPE) into AXLearn across hundreds of modules with just 10 lines of code, compared to the hundreds required in other systems. At the same time, AXLearn maintains equivalent performance compared to state-of-the-art training systems. Finally, we share our experience in the development and operation of AXLearn.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2507.05411

Country: North America > United States > California (0.46)

Genre: Research Report (0.70)

Industry: Information Technology > Services (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
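
For readers unfamiliar with the Rotary Position Embeddings (RoPE) mentioned in the AXLearn abstract, here is a minimal pure-Python sketch of the rotation itself. It is not AXLearn code and does not use AXLearn's API; the function names, the base of 10000, and the toy vectors are assumptions chosen for illustration.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate each consecutive pair (vec[i], vec[i+1]) by a position-dependent angle.

    The pair starting at even index i uses angle position * base**(-i / d),
    where d is the (even) vector length.
    """
    d = len(vec)
    if d % 2 != 0:
        raise ValueError("RoPE needs an even embedding dimension")
    rotated = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        # Standard 2-D rotation of the pair (x, y) by theta.
        rotated.extend([x * c - y * s, x * s + y * c])
    return rotated


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


# Toy query/key vectors (illustrative values only).
q = [1.0, 0.5, -0.3, 2.0]
k = [0.2, -1.0, 0.7, 0.1]

# Both pairs of positions are two steps apart, so the scores should match.
score_a = dot(rope(q, 5), rope(k, 3))
score_b = dot(rope(q, 12), rope(k, 10))
```

Because each pair is rotated rather than shifted, the dot product between a rotated query and a rotated key depends only on the difference between their positions, which is the property that makes the scheme attractive for attention.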


Less is More: Efficient Weight Farcasting with 1-Layer Neural Network

Shou, Xiao, Bhattacharjya, Debarun, Ding, Yanna, Zhao, Chen, Li, Rui, Gao, Jianxi

arXiv.org Artificial Intelligence, May-6-2025

Addressing the computational challenges inherent in training large-scale deep neural networks remains a critical endeavor in contemporary machine learning research. While previous efforts have focused on enhancing training efficiency through techniques such as gradient descent with momentum, learning rate scheduling, and weight regularization, the demand for further innovation continues to burgeon as model sizes keep expanding. In this study, we introduce a novel framework which diverges from conventional approaches by leveraging long-term time series forecasting techniques. Our method capitalizes solely on initial and final weight values, offering a streamlined alternative for complex model architectures. We also introduce a novel regularizer that is tailored to enhance the forecasting performance of our approach. Empirical evaluations conducted on synthetic weight sequences and real-world deep learning architectures, including the prominent large language model DistilBERT, demonstrate the superiority of our method in terms of forecasting accuracy and computational efficiency. Notably, our framework showcases improved performance while requiring minimal additional computational overhead, thus presenting a promising avenue for accelerating the training process across diverse tasks and architectures.

artificial intelligence, machine learning, neural network, (13 more...)

arXiv.org Artificial Intelligence

2505.02714

Country: North America > United States (0.68)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


QiMeng-Xpiler: Transcompiling Tensor Programs for Deep Learning Systems with a Neural-Symbolic Approach

Dong, Shouyang, Wen, Yuanbo, Bi, Jun, Huang, Di, Guo, Jiaming, Xu, Jianxing, Xu, Ruibai, Song, Xinkai, Hao, Yifan, Zhou, Xuehai, Chen, Tianshi, Guo, Qi, Chen, Yunji

arXiv.org Artificial Intelligence, May-6-2025

Heterogeneous deep learning systems (DLS) such as GPUs and ASICs have been widely deployed in industrial data centers, which requires developing multiple low-level tensor programs for different platforms. An attractive solution to relieve the programming burden is to transcompile the legacy code of one platform to others. However, current transcompilation techniques struggle with either tremendous manual effort or functional incorrectness, rendering "Write Once, Run Anywhere" of tensor programs an open question. We propose a novel transcompiler, i.e., QiMeng-Xpiler, for automatically translating tensor programs across DLS via both large language models (LLMs) and symbolic program synthesis, i.e., neural-symbolic synthesis. The key insight is leveraging the powerful code generation ability of LLMs to make costly search-based symbolic synthesis computationally tractable. Concretely, we propose multiple LLM-assisted compilation passes via pre-defined meta-prompts for program transformation. During each program transformation, efficient symbolic program synthesis is employed to repair incorrect code snippets with a limited scale. To attain high performance, we propose a hierarchical auto-tuning approach to systematically explore both the parameters and sequences of transformation passes. Experiments on 4 DLS with distinct programming interfaces, i.e., Intel DL Boost with VNNI, NVIDIA GPU with CUDA, AMD MI with HIP, and Cambricon MLU with BANG, demonstrate that QiMeng-Xpiler correctly translates different tensor programs with an average accuracy of 95%, and the performance of translated programs achieves up to 2.0x over vendor-provided manually-optimized libraries. As a result, the programming productivity of DLS is improved by up to 96.0x via transcompiling legacy tensor programs.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2505.02146

Country: Europe > Austria (0.28)

Genre: Research Report (0.82)

Industry: Information Technology > Services (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Implementing An Artificial Quantum Perceptron

Hathidara, Ashutosh, Pandey, Lalit

arXiv.org Artificial Intelligence, Dec-2-2024

A perceptron is a fundamental building block of a neural network. The flexibility and scalability of the perceptron make it ubiquitous in building intelligent systems. Studies have shown the efficacy of a single neuron in making intelligent decisions. Here, we examined and compared two perceptrons with distinct mechanisms, and developed a quantum version of one of those perceptrons. As a part of this modeling, we implemented the quantum circuit for an artificial perceptron, generated a dataset, and simulated the training. Through these experiments, we show that there is an exponential growth advantage and test different qubit versions. Our findings show that this quantum model of an individual perceptron can be used as a pattern classifier. For the second type of model, we provide an understanding to design and simulate a spike-dependent quantum perceptron. Our code is available at https://github.com/ashutosh1919/quantum-perceptron

artificial intelligence, machine learning, perceptron, (15 more...)

arXiv.org Artificial Intelligence

2412.02083

Country: North America > United States > Indiana (0.05)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)
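
As background for the abstract above, the following is a minimal sketch of a classical (non-quantum) perceptron used as a pattern classifier. It is not the quantum circuit from the paper or code from its repository; the function names, the learning rate, and the AND-gate training set are assumptions chosen for illustration.

```python
def predict(weights, bias, x):
    """Threshold unit: output 1 if the weighted sum plus bias is positive, else 0."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=20, lr=1.0):
    """Classical perceptron learning rule.

    samples: list of (inputs, target) pairs with targets in {0, 1}.
    Training stops early once an epoch produces no misclassifications.
    """
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            delta = target - predict(weights, bias, x)
            if delta != 0:
                # Move the decision boundary toward the misclassified point.
                weights = [w + lr * delta * xi for w, xi in zip(weights, x)]
                bias += lr * delta
                errors += 1
        if errors == 0:
            break
    return weights, bias


# Linearly separable toy problem: logical AND.
AND_DATA = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(AND_DATA)
predictions = [predict(weights, bias, x) for x, _ in AND_DATA]
```

The update rule only converges when the classes are linearly separable, which is why AND is used here rather than, say, XOR.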


Reviews: Hybrid Reward Architecture for Reinforcement Learning

Neural Information Processing Systems, Oct-7-2024, 13:53:47 GMT

R5: Summary: This paper builds on the basic idea of the Horde architecture: learning many value functions in parallel with off-policy reinforcement learning. This paper shows that learning many value functions in parallel improves the performance on a single main task. The novelty here lies in a particular strategy for generating many different reward functions and how to combine them to generate behavior. The results show large improvements in performance in an illustrative grid world and Ms. Pac-Man. Decision: This paper is difficult to access.

hybrid reward architecture, representation, value function, (8 more...)

Neural Information Processing Systems

Genre: Research Report (0.36)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.63)


Reviews: Toddler-Inspired Visual Object Learning

Neural Information Processing Systems, Oct-7-2024, 09:25:58 GMT

The goal of the paper is to "data mine" records of toddlers' and their mothers' fixations while playing with a set of 24 toys in order to observe what might be good training data for a deep network, given a fixed training budget. The idea is that the toddler is the best visual learning system we know, and so the data that toddlers learn from should give us a clue about what data is appropriate for deep learning. They take fixation records extracted from toddlers (16-24 mo old) and their mothers collected via scene cameras and eye tracking to examine the data distribution of infants' visual input or mothers' visual input. This study clearly falls under the cognitive science umbrella at NIPS, although they try to make it about deep learning. For example, if they only cared about deep learning, they would not use a retinal filter. First, they manually collect data recording what toys the infants and mothers are fixating on (ignoring other fixations).

deep learning, learning, toddler, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.82)


Contexts Matter: An Empirical Study on Contextual Influence in Fairness Testing for Deep Learning Systems

Du, Chengwen, Chen, Tao

arXiv.org Artificial Intelligence, Aug-12-2024

Background: Fairness testing for deep learning systems has become increasingly important. However, much work assumes perfect context and conditions from the other parts: well-tuned hyperparameters for accuracy, rectified bias in data, and mitigated bias in the labeling. Yet, these are often difficult to achieve in practice due to their resource-/labour-intensive nature. Aims: In this paper, we aim to understand how varying contexts affect fairness testing outcomes. Method: We conduct an extensive empirical study, which covers 10,800 cases, to investigate how contexts can change the fairness testing result at the model level against the existing assumptions. We also study why the outcomes were observed through the lens of correlation/fitness landscape analysis. Results: Our results show that different context types and settings generally lead to a significant impact on the testing, which is mainly caused by the shifts of the fitness landscape under varying contexts. Conclusions: Our findings provide key insights for practitioners to evaluate the test generators and hint at future research directions.

fairness testing, generator, software engineering, (13 more...)

arXiv.org Artificial Intelligence

2408.06102

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
(7 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology (0.67)
Banking & Finance (0.67)
Education > Educational Setting (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Securing the Diagnosis of Medical Imaging: An In-depth Analysis of AI-Resistant Attacks

Biswas, Angona, Nasim, MD Abdullah Al, Gupta, Kishor Datta, George, Roy, Rashid, Abdur

arXiv.org Artificial Intelligence, Aug-1-2024

Machine learning (ML) is a rapidly developing area of medicine that uses significant resources to apply computer science and statistics to medical issues. ML's proponents laud its capacity to handle vast, complicated, and erratic medical data. It is well known that attackers can cause misclassification by deliberately crafting inputs for machine learning classifiers. Adversarial examples have been studied extensively in computer vision applications. Healthcare systems are considered highly challenging because of the security and life-or-death considerations they involve, and performance accuracy is very important. Recent arguments have suggested that adversarial attacks could be made against medical image analysis (MedIA) technologies because of the accompanying technology infrastructure and powerful financial incentives. Since the diagnosis will be the basis for important decisions, it is essential to assess how robust medical DNN tasks are against adversarial attacks. Simple adversarial attacks have been taken into account in several earlier studies. However, DNNs are susceptible to more risky and realistic attacks. The present paper covers recently proposed adversarial attack strategies against DNNs for medical imaging as well as countermeasures. In this study, we review current techniques for adversarial imaging attacks and their detection. It also encompasses various facets of these techniques and offers suggestions for improving the robustness of neural networks in the future.

adversarial attack, adversarial example, neural network, (13 more...)

arXiv.org Artificial Intelligence

2408.00348

Country:

Asia > China > Guangdong Province > Shenzhen (0.04)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Government > Military (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(2 more...)
